NSF PAR Search | NSF Public Access Repository

Advancing Chart Question Answering with Robust Chart Component Recognition

https://doi.org/10.1109/WACV61041.2025.00560

Zheng, Hanwen; Wang, Sijia; Thomas, Chris; Huang, Lifu (February 2025, IEEE)

Chart comprehension presents significant challenges for machine learning models due to the diverse and intricate shapes of charts. Existing multimodal methods often overlook these visual features or fail to integrate them effectively for Chart Question Answering. To address this, we introduce CHARTFORMER, a unified framework that enhances chart component recognition by accurately identifying and classifying components such as bars, lines, pies, titles, legends, and axes. Additionally, we propose a novel Question-guided Deformable Co-Attention (QDCAt) mechanism, which fuses chart features encoded by CHARTFORMER with the given question, leveraging the question's guidance to ground the correct answer. Extensive experiments demonstrate a 3.2% improvement in mAP over the baselines for chart component recognition. For ChartQA and OpenCQA tasks, our approach achieves improvements of 15.4% in accuracy and 0.8 in BLEU score, respectively, underscoring the robustness of our solution for detailed visual data interpretation across various applications.
Free, publicly-accessible full text available February 26, 2026
Debate as Optimization: Adaptive Conformal Prediction and Diverse Retrieval for Event Extraction

https://doi.org/10.18653/v1/2024.findings-emnlp.958

Wang, Sijia; Huang, Lifu (November 2024, Proceedings of the conference Association for Computational Linguistics Meeting)
Al-Onaizan, Y; Bansal, M; Chen, Y (Ed.)
We propose a multi-agent debate as optimization (DAO) system for event extraction, where the primary objective is to iteratively refine the outputs of large language models (LLMs) through debate, without parameter tuning. In DAO, we introduce two novel modules: the Diverse-RAG (DRAG) module and the Adaptive Conformal Prediction (AdaCP) module. DRAG systematically retrieves supporting information that best fits the debate discussion, while AdaCP enhances the accuracy and reliability of event extraction by effectively rejecting less promising answers. Experimental results demonstrate a significant reduction in the performance gap between supervised approaches and tuning-free LLM-based methods: 18.1% and 17.8% on ACE05, and 17.9% and 15.2% on CASIE, for event detection and argument extraction, respectively.
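The abstract above does not spell out how AdaCP rejects answers. As a rough illustration of the general idea only, the following is a minimal sketch of standard split conformal prediction, not the authors' AdaCP module. The function names, the use of `1 - confidence` as the nonconformity score, and the calibration values are all assumptions made for this example.

```python
import math


def conformal_threshold(calibration_scores, alpha):
    """Return a nonconformity threshold targeting roughly (1 - alpha) coverage.

    Uses the finite-sample corrected quantile rank from split conformal
    prediction. The scores are hypothetical calibration values.
    """
    n = len(calibration_scores)
    rank = min(n, math.ceil((n + 1) * (1 - alpha)))
    return sorted(calibration_scores)[rank - 1]


def filter_answers(candidates, threshold):
    """Keep candidate labels whose nonconformity (1 - confidence) is within the threshold."""
    return [label for label, confidence in candidates if 1.0 - confidence <= threshold]


# Hypothetical nonconformity scores from a held-out calibration set.
calibration = [0.05, 0.10, 0.20, 0.30, 0.40, 0.15, 0.25]
threshold = conformal_threshold(calibration, alpha=0.25)

# Hypothetical candidate event types with model confidences.
kept = filter_answers([("Attack", 0.9), ("Transport", 0.8), ("Meet", 0.4)], threshold)
print(threshold, kept)
```

In this sketch, candidates with low confidence fall outside the calibrated threshold and are dropped, which is one simple way to "reject less promising answers"; an adaptive variant would adjust the threshold as the debate proceeds.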
Full Text Available
Targeted Augmentation for Low-Resource Event Extraction

https://doi.org/10.18653/v1/2024.findings-naacl.275

Wang, Sijia; Huang, Lifu (June 2024, Association for Computational Linguistics)

Low-resource information extraction remains an ongoing challenge due to the inherent information scarcity within limited training examples. Existing data augmentation methods, considered potential solutions, struggle to strike a balance between weak augmentation (e.g., synonym augmentation) and drastic augmentation (e.g., conditional generation without proper guidance). This paper introduces a novel paradigm that employs targeted augmentation and back validation to produce augmented examples with enhanced diversity, polarity, accuracy, and coherence. Extensive experimental results demonstrate the effectiveness of the proposed paradigm. Furthermore, identified limitations are discussed, shedding light on areas for future improvement.
Full Text Available
RE2: Region-Aware Relation Extraction from Visually Rich Documents

https://doi.org/10.18653/v1/2024.naacl-long.484

Ramu, Pritika; Wang, Sijia; Mouatadid, Lalla; Rimchala, Joy; Huang, Lifu (June 2024, Association for Computational Linguistics)

Current research in form understanding predominantly relies on large pre-trained language models, necessitating extensive data for pre-training. However, the importance of layout structure (i.e., the spatial relationship between the entity blocks in the visually rich document) to relation extraction has been overlooked. In this paper, we propose REgion-Aware Relation Extraction (RE²) that leverages region-level spatial structure among the entity blocks to improve their relation prediction. We design an edge-aware graph attention network to learn the interaction between entities while considering their spatial relationship defined by their region-level representations. We also introduce a constraint objective to regularize the model towards consistency with the inherent constraints of the relation extraction task. To support the research on relation extraction from visually rich documents and demonstrate the generalizability of RE², we build a new benchmark dataset, DiverseForm, that covers a wide range of domains. Extensive experiments on DiverseForm and several public benchmark datasets demonstrate significant superiority and transferability of RE² across various domains and languages, with up to 18.88% absolute F-score gain over all high-performing baselines.
Full Text Available
The Art of Prompting: Event Detection based on Type Specific Prompts

https://doi.org/10.18653/v1/2023.acl-short.111

Wang, Sijia; Yu, Mo; Huang, Lifu (August 2023, Association for Computational Linguistics)

We compare various forms of prompts to represent event types and develop a unified framework to incorporate event-type-specific prompts for supervised, few-shot, and zero-shot event detection. The experimental results demonstrate that a well-defined and comprehensive event type prompt can significantly improve event detection performance, especially when the annotated data is scarce (few-shot event detection) or not available (zero-shot event detection). By leveraging the semantics of event types, our unified framework shows up to 22.2% F-score gain over the previous state-of-the-art baselines.
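The abstract does not list the prompt forms that were compared. Purely as an illustration of what a type-specific prompt might look like, the sketch below assumes a prompt assembled from an event type name, an optional definition, and optional seed trigger words, then paired with the input sentence. The function names, the ` | ` and ` [SEP] ` separators, and the example event type are all hypothetical and are not taken from the paper.

```python
def build_type_prompt(type_name, definition="", seed_triggers=()):
    """Assemble a textual prompt describing one event type.

    Which components to include (name, definition, triggers) is an
    assumption made for this sketch.
    """
    parts = [type_name]
    if definition:
        parts.append(definition)
    if seed_triggers:
        parts.append("Triggers: " + ", ".join(seed_triggers))
    return " | ".join(parts)


def build_model_input(sentence, prompt, sep=" [SEP] "):
    """Pair the event type prompt with the sentence to be scanned for triggers."""
    return prompt + sep + sentence


prompt = build_type_prompt(
    "Attack",
    definition="A violent act causing harm or damage.",
    seed_triggers=["fire", "bomb"],
)
model_input = build_model_input("Rebels bombed the bridge.", prompt)
print(model_input)
```

Because the prompt is built only from a description of the event type, the same construction can in principle be reused for types with few or no annotated examples, which is the setting the abstract highlights.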
Full Text Available
Morphace: An Integrated Approach for Designing Customizable and Transformative Facial Prosthetic Makeup

https://doi.org/10.1145/3519391.3519406

Wang, Sijia; Fang, Cathy Mengying; Yang, Yiyao; Lu, Kexin; Vlachostergiou, Maria; Yao, Lining (March 2022, Augmented Humans 2022)

Full Text Available